IStore: Towards High Efficiency, Performance, and Reliability in Distributed Data Storage with Information Dispersal Algorithms

Authors

  • Corentin Debains
  • Pedro Alvarez-Tabio
  • Dongfang Zhao
  • Kent Burlingame
  • Ioan Raicu
Abstract

Reliability is one of the major challenges for high performance computing and cloud computing. Data replication is a commonly used mechanism to achieve high reliability. Unfortunately, it has low storage efficiency, among other shortcomings. As an alternative to data replication, information dispersal algorithms offer higher storage efficiency, but at the cost of being too compute-intensive for today’s processors. This paper explores the possibility of using erasure coding (a form of information dispersal algorithm) for data redundancy while accelerating its operation with GPUs. We compare the performance of erasure coding on the CPU and on the GPU, showing a 10X higher throughput for the GPU. With this promising result, we design and implement a distributed data store with erasure coding, called IStore, to demonstrate that erasure coding could serve as a better alternative to traditional data replication in distributed file systems. We evaluate IStore on a 32-node cluster to assess its scalability and parameter sensitivity.
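The storage-efficiency argument in the abstract can be illustrated with a minimal sketch. It is not IStore's implementation, which the abstract does not describe: it assumes the simplest possible erasure code (k data chunks plus one XOR parity chunk, tolerating the loss of any single chunk), and all function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def replication_efficiency(replicas: int) -> float:
    """Fraction of raw storage holding unique data with `replicas` full copies."""
    return 1 / replicas


def erasure_efficiency(k: int, m: int) -> float:
    """Fraction of raw storage holding unique data with k data and m parity chunks."""
    return k / (k + m)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k zero-padded chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def recover(chunks: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data when at most one chunk (data or parity) is None."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("single-parity code tolerates only one lost chunk")
    chunks = list(chunks)
    if missing:
        # XOR of all surviving chunks equals the lost chunk.
        present = [c for c in chunks if c is not None]
        chunks[missing[0]] = reduce(xor_bytes, present)
    return b"".join(chunks[:-1])[:original_len]
```

Under these assumptions, three-way replication stores one useful byte per three raw bytes, while a 4+1 parity layout stores four per five. Production systems typically use codes such as Reed-Solomon that tolerate more than one loss, at a higher computational cost, which is the cost the abstract proposes to offload to GPUs.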

Similar resources

An Efficient Keyless Fragmentation Algorithm for Data Protection

The family of Information Dispersal Algorithms is applied to distributed systems for secure and reliable storage and transmission. In comparison with perfect secret sharing it achieves a significantly smaller memory overhead and better performance, but provides only incremental confidentiality. Therefore, even if it is not possible to explicitly reconstruct data from less than the required amou...

E2DR: Energy Efficient Data Replication in Data Grid

Abstract: Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client's applications in scientific and business domai...

Distributed and Cooperative Compressive Sensing Recovery Algorithm for Wireless Sensor Networks with Bi-directional Incremental Topology

Recently, the problem of compressive sensing (CS) has attracted considerable attention in the area of signal processing, and much research is being carried out in this field. One of the applications where CS could be used is wireless sensor networks (WSNs). The structure of WSNs consists of many low-power wireless sensors. This requires that any improved algorithm for this appli...

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are a number of challenges in WSNs because of limited battery power, communications, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...

Comparison of the Efficiency of Data Mining Algorithms in Predicting the Diagnosis of Diabetes

Background: Diabetes is one of the major health problems in Iran, where about 4.6 million adults suffer from the disease. Poor diagnosis has left half of them unaware of their condition. In recent years, along with the use of computers in data analysis and storage, the volume and complexity of data have increased dramatically. Methods: In health organizations, data pl...

Publication date: 2013